This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

support RLE and binary mask #150

Closed · wants to merge 9 commits

Conversation

@wangg12 (Contributor) commented Nov 13, 2018

No description provided.

@facebook-github-bot

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please sign up at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need the corporate CLA signed.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@fmassa (Contributor) left a comment

This is starting to look pretty good, thanks!

I have a few comments that I think would be worth discussing / addressing.

4 review comments on maskrcnn_benchmark/structures/segmentation_mask.py (outdated, resolved)
@fmassa (Contributor) commented Nov 13, 2018

Also, I think it would be awesome to add some tests in maskrcnn-benchmark/tests, which we are currently lacking; that would make reviewing the changes much easier!

@wangg12 (Contributor, Author) commented Nov 13, 2018

@fmassa I think it should be consistent with Detectron now.

@facebook-github-bot added the CLA Signed label on Nov 14, 2018
@facebook-github-bot

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

@fmassa (Contributor) commented Nov 14, 2018

I'll have another closer look today, but would you mind adding some tests in tests/? It will make things much easier!

@wangg12 (Contributor, Author) commented Nov 14, 2018

@fmassa I'm not familiar with how to write the tests.

@fmassa (Contributor) commented Nov 14, 2018

Ok, no worries

@fmassa (Contributor) left a comment

I added a few more comments, but it's not a full review yet.

2 review comments on maskrcnn_benchmark/structures/segmentation_mask.py (outdated, resolved)
@wangg12 (Contributor, Author) commented Nov 14, 2018

@fmassa I've added some tests for segmentation_mask. However, transpose and resize cannot pass the tests. Some help is needed.

@fmassa (Contributor) left a comment

I don't think we need all those files for this test.

Also, can you tell me what the difference is between the two implementations (the numerical difference), so that I understand the problem a bit better?

Review comments on tests/common_utils.py and tests/test_segmentation_mask.py (outdated, resolved)
@wangg12 (Contributor, Author) commented Nov 14, 2018

For resize I can understand that there may be some numerical difference because of the interpolation, but I don't know how to make them equivalent. For transpose the difference is much stranger; I don't know where it comes from.

@fmassa (Contributor) commented Nov 14, 2018

For the resize, I'd check whether the difference appears at the boundaries. Also, the values should be in 0-1.
For the transpose, I'd see if the difference disappears if you set this TO_REMOVE = 0. That could explain the difference.

Also, visualizing both results in the image space is going to be very helpful.
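
For reference, a minimal sketch of that kind of visualization (show_diff and its arguments are hypothetical helpers, assuming both results are already available as H x W binary arrays):

import numpy as np
import matplotlib.pyplot as plt

def show_diff(mask_a, mask_b, title="polygon vs. binary mask"):
    # Signed difference: +1 where only mask_a is set, -1 where only mask_b is set.
    diff = np.asarray(mask_a, dtype=np.int32) - np.asarray(mask_b, dtype=np.int32)
    plt.imshow(diff, cmap="bwr", vmin=-1, vmax=1)
    plt.title("{}: {} differing pixels".format(title, np.count_nonzero(diff)))
    plt.colorbar()
    plt.show()

If the differing pixels form thin lines along the object contours, the discrepancy is a boundary effect rather than a systematic offset.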

@wangg12 (Contributor, Author) commented Nov 14, 2018

@fmassa For transpose, I changed TO_REMOVE = 0 in the Polygons class. The difference is smaller but still exists. For resize, I've visualized the result; the difference comes almost entirely from the boundaries.

@fmassa (Contributor) commented Nov 14, 2018

Ok, this is good progress, thanks!

I'd need to look more closely to see where the difference might come from for the transpose; there might be a few pixels off, which can also be observed by viewing the images.

For resize, I'd need to check a bit more carefully to see if I spot anything, but I might not have the time today or tomorrow.

@JoyHuYY1412

@fmassa For transpose, I changed TO_REMOVE=0 in class Polygons. The difference is smaller but still exists. For resize, I've visualized the result, the difference almost comes from the boundaries.

If I use the segmentation_mask.py you wrote, are there other changes I should apply to other files?
It looks like the functions and interfaces fit the original code well.
Also, is there anything else I should pay attention to? Maybe I can try the transforms without transposing first? Thank you!

@fmassa (Contributor) commented Jan 22, 2019

@JoyHuYY1412 you'd need to check the interpolation (to use bilinear instead of nearest), apart from that, the rest should be unchanged (or almost).

@txytju do you mind finishing this PR, given that you managed to make it work for your case?

@JoyHuYY1412

@JoyHuYY1412 you'd need to check the interpolation (to use bilinear instead of nearest), apart from that, the rest should be unchanged (or almost).
Thank you~

@IssamLaradji

I like this!

@IssamLaradji commented Jan 25, 2019

I faced a problem with cropped_mask = self.mask[box[1]: box[3], box[0]: box[2]] in the code below: box[0] and box[2] were equal, resulting in width = 0 and causing an error.

def crop(self, box):
    box = [int(b) for b in box]
    w, h = box[2] - box[0], box[3] - box[1]
    w = max(w, 1)
    h = max(h, 1)
    # Fails when box[0] == box[2]: the slice below yields a zero-width mask
    # even though w has been clamped to 1.
    cropped_mask = self.mask[box[1]: box[3], box[0]: box[2]]
    return Mask(cropped_mask, size=(w, h), mode=self.mode)

This happened because I called cropped_mask = segmentation_mask.crop(proposal), where proposal is tensor([610.0664, 258.8555, 610.7168, 269.9121]), which int() truncated to tensor([610, 258, 610, 269]).

@wangg12 (Contributor, Author) commented Jan 26, 2019

@IssamLaradji I think it should be w, h = box[2] - box[0] + 1, box[3] - box[1] + 1.

And I guess round is more suitable than int?
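
A minimal, illustrative way to combine the suggestions in this thread (rounding instead of truncating, clamping to non-negative coordinates, and guaranteeing a non-empty crop); safe_crop is a hypothetical helper, not the PR's code, and it sidesteps the +1 inclusive-endpoint question:

import torch

def safe_crop(mask, box):
    # Round (rather than truncate) the box coordinates and clamp them to >= 0.
    x1, y1, x2, y2 = (max(int(round(float(b))), 0) for b in box)
    # Guarantee at least a 1x1 crop, even for degenerate boxes like the one above.
    w, h = max(x2 - x1, 1), max(y2 - y1, 1)
    return mask[y1:y1 + h, x1:x1 + w]

mask = torch.zeros(480, 640)
print(safe_crop(mask, (610.0664, 258.8555, 610.7168, 269.9121)).shape)  # torch.Size([11, 1])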

@fmassa (Contributor) commented Feb 18, 2019

@botcs if you could write unit tests for this PR, it would be awesome!

@botcs (Contributor) commented Feb 18, 2019

Using the available test_segmentation_mask, I have tried to visualize the Polygon and the Mask tests; here is a notebook with my observations:
https://gist.github.com/botcs/95176d877dcd26e48e46cceecdac5763

@fmassa (Contributor) commented Feb 19, 2019

@botcs that's awesome, thanks for the notebook!
So indeed we have the boundary effects as before.

One last thing I'd like to do to verify whether these boundary effects actually matter is to run a training with https://github.com/facebookresearch/maskrcnn-benchmark/blob/master/configs/quick_schedules/e2e_mask_rcnn_R_50_FPN_quick.yaml , which is a fast training that should take ~10 min to run, using the Mask class instead of Polygons.

For that, I'd (locally, this is not to be committed) modify maskrcnn_benchmark/data/datasets/coco.py, lines 82 to 84 (at f8b0118):

masks = [obj["segmentation"] for obj in anno]
masks = SegmentationMask(masks, img.size)
target.add_field("masks", masks)

so that it uses Mask instead.

I'd expect to have as results something like

EXPECTED_RESULTS: [['coco_2014_minival', 'box', 'AP', [0.082300, 0.001682]], ['coco_2014_minival', 'mask', 'AP', [0.075039, 0.001872]]]

so box AP should be around 8.2, and mask AP around 7.5.

Could you do that?

Thanks!

s += "image_height={}, ".format(self.size[1])
s += "mode={})".format(self.mode)
return s



class Polygons(object):
@botcs (Contributor) commented on the diff above, Feb 19, 2019

I think the mode field for the Polygon is completely irrelevant.
It is never used, but causes:

  • additional argument passing when constructing
  • a headache when trying to find out what Polygon really is

Question:
Wouldn't it be more consistent if the convert method of a Polygon were renamed to convert_to_mask and returned a Mask instance?

@fmassa (Contributor) replied:

Hi,

I agree with your points.

The reason why this is currently the case is that I wanted to keep the same interface between Polygons and Box (which is not implemented, but would be the single-box equivalent of BoxList).
And my original idea was that we would be able to specify what was the underlying type of the data via the mode: is it a polygon, or a mask?

I'm not sure about changing the convert name of the method though.

In general, I think both box_list and segmentation_mask could benefit from some better design / cleanup, but I'm not sure what that would be

@botcs (Contributor) replied:

And my original idea was that we would be able to specify what was the underlying type of the data via the mode: is it a polygon, or a mask?

I think I cannot follow this part:

To specify what was the underlying type of the data via the mode

As in the current implementation, a Polygon instance:

  1. can be initialized with a list of polygons
  2. can be initialized with a Polygon instance (which is currently referenced, but should be deep-copied IMO)
  3. cannot be initialized with a Mask, a feature that could be added if necessary (I am doing this to convert GTA binary masks to the COCO Polygon format, but only because binary masks are not supported).

So the underlying data would be specified: Polygon.

On the other hand, about the convert function:

I'm not sure about changing the convert name of the method though.

  1. The convert function takes an argument for the target mode, but it actually accepts only a single value, which is odd.
  2. If I assume that a Polygon can only be convert-ed to a Mask, then the convert name is OK, but it relies on the assumption that the data can be represented either as Polygons or as Masks and nothing else, which is not necessarily a trivial assumption, so changing the name to convert_to_mask would be clear from the very first encounter.

keep the same interface

  1. In that case we should also add a convert or convert_to_polygon method to the Mask class (see the sketch below).
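
Concretely, the symmetric interface being discussed would look roughly like this (method names are only the ones floated in this thread; the refactor that was eventually merged in #473 has its own design):

class Polygons(object):
    def convert_to_mask(self):
        # Rasterize the polygons into a binary Mask of the same image size.
        ...

class Mask(object):
    def convert_to_polygon(self):
        # Trace the binary mask back into a polygon representation.
        ...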

@fmassa (Contributor) replied:

Those are all reasonable points, and I'm willing to accept PRs that improve the overall consistency and software design of the codebase

@botcs (Contributor) replied:

@fmassa Thanks, these points were considered for the refactored version, PR #473

@botcs (Contributor) commented Feb 19, 2019

(Quoting @fmassa's comment above about running the e2e_mask_rcnn_R_50_FPN_quick.yaml quick schedule with the Mask class instead of Polygons, modifying coco.py locally, and comparing against the expected box/mask AP.)

I have trained the model, which went fine, but the evaluation failed with the following error:

  File "/home/csbotos/anaconda3/envs/debugmask/lib/python3.7/site-packages/maskrcnn_benchmark-0.1-py3.7-linux-x86_64.egg/maskrcnn_benchmark/structures/segmentation_mask.py", line 206, in __init__
    if not isinstance(segms[0], (list, Polygons)):
IndexError: list index out of range

The whole output can be found at this gist.

@botcs (Contributor) commented Feb 19, 2019

I have suppressed the error by using a single empty instance with an all-zero mask when the provided segms is an empty list. The results are the following: box AP is 6.0 and mask AP is 5.9.

That is quite a bit below the expected performance, and I am now rerunning the training to see if it is still the case.
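
Roughly, the workaround amounts to a guard like the following before the segms[0] access shown in the traceback (a sketch of the idea only, not the actual patch; whether the constructor accepts a raw tensor depends on the Mask/SegmentationMask API in this PR):

import torch

def ensure_nonempty(segms, size):
    # size is (width, height), as used elsewhere in the codebase.
    width, height = size
    if len(segms) == 0:
        # Fall back to a single all-zero mask so downstream code still receives
        # a well-formed (empty) instance instead of raising IndexError.
        segms = [torch.zeros((height, width), dtype=torch.uint8)]
    return segms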

@botcs (Contributor) commented Feb 19, 2019

I was curious whether the expected results in @fmassa's comment were correct, so I ran the training using both Polygons and Mask, two runs each, and got the following results:

  • Polygon: box AP 4.4, mask AP 4.3
  • Mask: box AP 6.0, mask AP 5.9

So the Mask is doing better than the Polygon, while the expected results for the training script were higher: box 8.2, mask 7.5.
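
For comparison with the quick-schedule expectations quoted earlier, a rough check, assuming the Detectron-style convention that each EXPECTED_RESULTS entry is [dataset, task, metric, [mean, std]] and that a run should land within a few standard deviations of the mean (the tolerance and helper name below are illustrative):

EXPECTED_RESULTS = [
    ['coco_2014_minival', 'box', 'AP', [0.082300, 0.001682]],
    ['coco_2014_minival', 'mask', 'AP', [0.075039, 0.001872]],
]

def within_expectation(actual, mean, std, sigma_tol=4.0):
    return abs(actual - mean) <= sigma_tol * std

# The Mask run above reached box AP 6.0, i.e. 0.060:
dataset, task, metric, (mean, std) = EXPECTED_RESULTS[0]
print(within_expectation(0.060, mean, std))  # False: well below the expected ~8.2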

@JoyHuYY1412

@IssamLaradji I think it should be w, h = box[2] - box[0] + 1, box[3] - box[1] + 1.

And I guess round is more suitable than int?

In the code, should it be round(float(b))?

@botcs (Contributor) commented Feb 21, 2019

Hi guys,

A few days ago @fmassa mentioned in one of his comments the following:

In general, I think both box_list and segmentation_mask could benefit from some better design / cleanup, but I'm not sure what that would be

So I tried to reshape things a bit to better accommodate the different requirements, but it radically changed a few concepts. I would be keen to hear your opinions on it: #473

@IssamLaradji

I am getting this error with this code:

  File "/maskrcnn-benchmark/maskrcnn_benchmark/modeling/roi_heads/mask_head/loss.py", line 39, in project_masks_on_boxes 
    scaled_mask = cropped_mask.resize((M, M)) 
  File "/mnt/home/issam/Research_Ground/domain_adaptation/ann_utils.py", line 282, in resize 
    self.mask[None, None, :, :], (height, width), mode="bilinear" 
  File "/miniconda/envs/py36/lib/python3.6/site-packages/torch/nn/functional.py", line 2447, in interpolate 
    return torch._C._nn.upsample_bilinear2d(input, _output_size(2), align_corners) 
RuntimeError: invalid argument 2: input and output sizes should be greater than 0, but got input (H: 2, W: 0) output (H: 28, W: 28) at /pytorch/aten/src/THNN/generic/SpatialUpSamplingBilinear.c:19 
Uncaught exception. Entering post mortem debugging 
Running 'cont' or 'step' will restart the program 

self.mask = tensor([], size=(2, 0))

@fmassa (Contributor) commented Feb 28, 2019

@IssamLaradji does this also happen with #473 ?

@jefequien commented Apr 2, 2019

@IssamLaradji I think it should be box = [max(round(float(b)), 0) for b in box]. The bad input size comes from a crop when a box coordinate somehow becomes -1.

@wangg12 closed this on Apr 9, 2019
@fmassa (Contributor) commented Apr 9, 2019

Thanks for the initial work on this PR @wangg12 ! This has been merged in #473

@IssamLaradji commented Apr 9, 2019

Does this work yet for RLE? I am getting this when I pass a list of RLEs.

File "/maskrcnn-benchmark/maskrcnn_benchmark/structures/segmentation_mask.py", line 73, in __init__
    if len(masks.shape) == 2:
AttributeError: 'list' object has no attribute 'shape'
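
One possible workaround until RLE is handled natively is to decode the RLEs into per-instance binary masks first (a sketch assuming compressed COCO RLE dicts; whether the decoded arrays can be passed straight to SegmentationMask depends on the API merged in #473):

import pycocotools.mask as mask_utils

def rles_to_binary_masks(rles):
    # decode() returns an H x W x N uint8 array for a list of compressed RLE dicts;
    # uncompressed RLEs (with "counts" given as a list) need mask_utils.frPyObjects first.
    decoded = mask_utils.decode(rles)
    if decoded.ndim == 2:  # a single RLE dict decodes to H x W
        decoded = decoded[:, :, None]
    return [decoded[:, :, i] for i in range(decoded.shape[2])]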

@ShihuaiXu

AttributeError: 'list' object has no attribute 'shape'
I met the same problem!

@botcs (Contributor) commented Jul 6, 2019

Hi @ShihuaiXu ,
Please visit the updates here
